Entropy Methods in Random Motion 

Piotr Garbaczewski* 
Institute of Physics, University of Zielona Gora, 65-516 Zielona Gora, Poland 

February 2, 2008 



Abstract 

We analyze a contrasting dynamical behavior of Gibbs-Shannon and conditional Kullback-Leibler entropies, induced by time-evolution of continuous probability distributions. The question of predominantly purpose-dependent entropy definition for non-equilibrium model systems is addressed. The conditional Kullback-Leibler entropy is often believed to properly capture physical features of an asymptotic approach towards equilibrium. We give arguments in favor of the usefulness of the standard Gibbs-type entropy and indicate that its dynamics gives an insight into physically relevant non-equilibrium phenomena that are generally ignored in the literature. The role of physical units in the Gibbs-Shannon entropy definition is discussed.

PACS numbers: 02.50.-r, 89.70.+c, 05.40.-a

1 Introduction 

There are many notions of entropy. Except for the Clausius (thermodynamic) entropy, none of them may be considered unambiguously defined, nor does any of them share the status of a physically universal quantity within the class of dynamical systems and phenomena for whose description it was designed.

Let us reproduce the standard (albeit non-exhaustive) list of entropies. For classical dynamical systems one is tempted to use any of: Boltzmann, Gibbs, Shannon, Kullback-Leibler, Rényi, Tsallis, information/differential, topological, measure-theoretic and Kolmogorov-Sinai entropies. In the quantum case one encounters von Neumann, Wehrl and Leipnik entropies, plus more or less natural/obvious generalizations of the Kullback-Leibler, Rényi and Tsallis entropies, which are classical by provenance. The concrete entropy choice is without doubt dependent on the context (classical or quantum setting, specific model system, specific notion of state, microstate and macrostate) and on the purpose.

We shall follow associations born of non-equilibrium statistical physics, where time-dependent problems raise such issues as "trends" (convergence or divergence) towards stationary states, Boltzmann-type theorems (temporal behavior of H-functionals) together with their validity, limitations and possible violations, general rules of entropy evolution, and the meaning of the entropy "production"/dissipation and its temporal behavior.

The term entropy methods essentially refers to the mathematically rigorous discussion of the asymptotic (large time) behavior of solutions of various partial differential equations, in particular those governing the dynamics of probability densities. One attempts to quantify the speed of convergence (or divergence) of measures that allow one to differentiate among different solutions and their possibly different temporal properties.

To set the stage for the main theme of our considerations, let us invoke the simplest (naive) version of the Boltzmann H-theorem, valid in case of the rarefied gas (mass m particles), without external forces, close to its thermal equilibrium, under an assumption of its space homogeneity.

If the probability density function $f(v)$ is a solution of the corresponding Boltzmann kinetic equation, then the Boltzmann H-function (which coincides with the negative of the Gibbs-Shannon entropy) $H(t) = \int f(v) \ln f(v)\, dv$ does not increase:

$$\frac{d}{dt} H(t) \leq 0 . \tag{1}$$

In particular, we know that there exists an invariant (asymptotic) density $f_*(v) \sim \exp[-m(v - v_0)^2/2k_B T]$ and $H(t)$ is a constant only if $f = f_*(v)$.

Notice that in the one-dimensional case, the $L^1(R)$ density normalization coefficient reads $(m/2\pi k_B T)^{1/2}$ and thence, formally, $H_* = \int f_* \ln f_*\, dv = -(1/2) \ln(2\pi e k_B T/m)$, where $e$ is the base of the natural logarithm. One must be aware of an apparent dimensional difficulty [3], since an argument of the logarithm is not dimensionless.

Clearly, a consistent integration outcome for $H(t)$ should involve a dimensionless argument $k_B T/m[v]^2$ instead of $k_B T/m$, provided $[v]$ stands for any unit of velocity. An example is $[v] = 1$ m/s (here m stands for the SI length unit, and not for a mass parameter), or any power of ten times that unit. To this end, it suffices to redefine $H_*$ as follows:

$$H_* \to H_*^{[v]} = \int f_* \ln([v] \cdot f_*)\, dv . \tag{2}$$

Multiplying $f_*$ by $[v]$ we arrive at the dimensionless argument of the logarithm in the above.

We shall come back later to a deeper discussion of the impact of dimensional units on the general definition of the Gibbs-Shannon entropy

$$S(\rho) = -\int \rho(x) \ln \rho(x)\, dx \tag{3}$$

for $\rho \in L^1(R^n)$.

The entropy methods basically refer to the large time asymptotics of the heat and Fokker-Planck equations, where in mathematically oriented research all dimensional units, for the sake of clarity, are scaled away. Let us consider the heat equation in the re-scaled (no physical constants) form: $\partial_t u = \Delta u$ with $x \in R^n$, $t \in R^+$ and $u(\cdot, t=0) = u_0(\cdot) \geq 0$, $\int u_0(x)\, dx = 1$.

As $t \to \infty$, for any $u(x,t)$ we have $u(x,t) \sim \rho(x,t) = (4\pi t)^{-n/2} \exp[-x^2/4t]$, in conformity with the standard wisdom [7] that a regular solution of the heat equation behaves asymptotically as a fundamental solution, once time goes to infinity.

There is a natural question to be addressed: what is the $t \to \infty$ rate of convergence of the so-called Kullback "distance"

$$\|u - \rho\|_{L^1}(t) = \int |u(x,t) - \rho(x,t)|\, dx \tag{4}$$

between two densities. Since for two density functions $\rho$ and $\rho'$ there holds the Csiszár-Kullback inequality

$$\int \rho \ln(\rho/\rho')\, dx \geq (1/2)\, \|\rho - \rho'\|^2_{L^1} , \tag{5}$$

it is the Kullback-Leibler entropy

$$\mathcal{K}(\rho, \rho') = \int \rho(x) \ln \frac{\rho(x)}{\rho'(x)}\, dx \tag{6}$$

which actually stands for an upper bound upon a "distance measure" in the set of density functions.
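
The bound (5) can be probed numerically. The following Python sketch is an illustration only and not part of the original argument: it evaluates both sides of the Csiszár-Kullback inequality for a few pairs of one-dimensional Gaussian densities by plain trapezoidal quadrature. The function names, the integration window and the chosen parameter pairs are assumptions made for the example.

```python
import math

def log_gauss(x, mean, sigma):
    """Logarithm of a normalized one-dimensional Gaussian density."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

def kullback_leibler(m1, s1, m2, s2, lo=-30.0, hi=30.0, n=6000):
    """K(rho, rho') = int rho ln(rho / rho') dx, Eq. (6), by quadrature."""
    def integrand(x):
        lp = log_gauss(x, m1, s1)
        return math.exp(lp) * (lp - log_gauss(x, m2, s2))
    return trapezoid(integrand, lo, hi, n)

def l1_distance(m1, s1, m2, s2, lo=-30.0, hi=30.0, n=6000):
    """L^1 distance between the two densities, as in Eq. (4)."""
    def integrand(x):
        return abs(math.exp(log_gauss(x, m1, s1)) - math.exp(log_gauss(x, m2, s2)))
    return trapezoid(integrand, lo, hi, n)

# Illustrative (mean, standard deviation) pairs; any other choice would do.
pairs = [((0.0, 1.0), (1.0, 2.0)), ((0.0, 1.0), (0.3, 1.0)), ((-1.0, 0.5), (2.0, 1.5))]
for (m1, s1), (m2, s2) in pairs:
    k = kullback_leibler(m1, s1, m2, s2)
    d = l1_distance(m1, s1, m2, s2)
    print(f"K = {k:.6f}   (1/2)||rho - rho'||^2 = {0.5 * d * d:.6f}")
```

In each case the first printed number should not fall below the second one, which is the content of Eq. (5).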

If we consider $\rho_t$ to be a solution of the heat equation with the initial data $\rho_0$ and take $\rho_\alpha(x) = (1/\sqrt{2\alpha\pi}) \exp[-x^2/2\alpha]$, then we may always find $\alpha$ and $k$ such that $\rho_{\alpha+kt}$ has the same second moment as $\rho_t$. This implies an asymptotic $1/t$ decay of the initially prescribed Kullback-Leibler "distance",

$$\mathcal{K}(\rho_t, \rho_{\alpha+kt}) \leq \mathcal{K}(\rho_0, \rho_\alpha)\, [\alpha/(\alpha + kt)] . \tag{7}$$

In view of the concavity of the function $f(w) = -w \ln w$, the Kullback-Leibler entropy is non-negative. This property is often contrasted with the fact that the Gibbs-Shannon entropy $S(\rho)$ may take negative values. Therefore, right at this point (anticipating further discussion) we introduce the notion of the conditional Kullback-Leibler entropy, which, although non-positive by construction,

$$\mathcal{H}_c(\rho, \rho') = -\mathcal{K}(\rho, \rho') , \tag{8}$$

is nonetheless one of the major tools in the study of an asymptotic convergence towards an invariant (equilibrium) density [8]. This entropy typically displays a prototype behavior (monotonic growth in time), expected to hold true if the entropy definition is to be compatible with the casual understanding of the second law of thermodynamics [9].

Now, let us consider the drifted Fokker-Planck (Smoluchowski) equation $\partial_t f = \Delta f - \nabla \cdot (bf)$, where $f(\cdot, t=0) = f_0 \geq 0$, $\int f_0(x)\, dx = 1$. We assume that the forward drift $b = b(x,t)$ has a gradient form. Let $f_*$ be the stationary solution of the F-P equation; then an obvious question is: what is the $t \to \infty$ rate of convergence of $\|f - f_*\|_{L^1}(t) = \int |f(x,t) - f_*(x)|\, dx$ towards the value 0?

The outcome, albeit not completely general, is that the solution decays in relative entropy to a Gaussian (Maxwellian), and the speed of such decay is exponential. This is typically encoded in a formula [8] of the form

$$\mathcal{H}_c(t) \sim \exp(-\alpha t)\, \mathcal{H}_c(0) , \tag{9}$$

where $\mathcal{H}_c(t) = \mathcal{H}_c(f_t, f_*)$, with $\alpha > 0$ and $f_t = f(x,t)$, $t \geq 0$. See also an explicit discussion of the Ornstein-Uhlenbeck process in [10].

In the course of the time evolution, the conditional entropy monotonically approaches its maximum at zero. This property is seldom shared by the Gibbs-Shannon entropy of the involved time-dependent probability density. The Gibbs entropy may grow, diminish, oscillate and show more complicated patterns of behavior [10, 11]. The physical relevance of such "strange" temporal properties is worth addressing, and this is our main goal in the present paper.

2 Gibbs-Shannon and Kullback-Leibler entropies 

A casual understanding of the entropy notion in physics is that entropy (tacitly one presumes to deal 
with its thermodynamic Clausius version) is a measure of the degree of randomness and the tendency 
(trend) of physical systems to become less and less organized. We attribute a very concrete meaning to 
the term organization - namely, we are interested in quantifying how good is the probability localization 
on the state space (whatever: configuration space, velocity or phase-space) of the system. 

As a hint let us consider a probability measure $\mu = (\mu_1, \mu_2, ..., \mu_N)$ on a system of N points, i.e. $\sum_{j=1}^{N} \mu_j = 1$. The standard Shannon entropy reads $S(\mu) = -\sum_{j=1}^{N} \mu_j \log \mu_j$. It obeys $0 \leq S(\mu) \leq \log N$ and its maximum $\log N$ corresponds to a uniform probability distribution $\mu_j = 1/N$ for all $j$.

If $X$ is a discrete random variable taking values $x_i$ with probabilities $p_i$, $i = 1, 2, ..., N$, the quantity $S(X) = -\sum_i p_i \log p_i$ is called the Shannon entropy of a discrete random variable or the entropy of the probability distribution $(p_1, ..., p_N)$. If $X$ takes infinitely many values $x_1, x_2, ...$ with probabilities $p_1, p_2, ...$, then the entropy $S(X)$ is not necessarily finite.

As a side comment we recall that if log has base 2, the unit of entropy is called a bit (binary digit), while for ln with base e, the unit of entropy is called a nat (natural); we observe that $\log_2 b \cdot \ln 2 = \ln b$.
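
A short Python sketch may help to fix these conventions. It is an illustration under our own assumptions (the function name `shannon_entropy` and the two sample distributions are hypothetical), showing the bound $S(\mu) \leq \log N$ and the conversion between bits and nats.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Shannon entropy -sum p_i log p_i; zero-probability terms contribute nothing."""
    if any(p < 0.0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-12):
        raise ValueError("probs must be non-negative and sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)

N = 8
uniform = [1.0 / N] * N                 # maximizes the entropy at log N
skewed = [0.5, 0.25, 0.125, 0.125]      # a less uniform, hence lower-entropy, example

bits = shannon_entropy(uniform, base=2)
nats = shannon_entropy(uniform)
print("uniform:", bits, "bits =", nats, "nats")
print("bits * ln 2 - nats =", bits * math.log(2.0) - nats)
print("skewed:", shannon_entropy(skewed, base=2), "bits")
```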

For a continuous random variable $X$ with values $x \in R^n$ and the probability density $\rho(x)$ one usually defines the Shannon entropy of a continuous random variable (called the differential entropy of $X$) as $S(X) = -\int_\Gamma \rho(x) \log \rho(x)\, dx$, where $\Gamma \subseteq R^n$ is the support set of $X$. One may also denote $S(X) = S(\rho)$.

There are a number of standard views about the discrete and continuous entropies. In the discrete case, the entropy quantifies randomness in an absolute way. In the continuous case there is no smooth limiting passage from the discrete to the continuous entropy. Then, the entropy cannot work "as it is" as a measure of global randomness and one usually invokes a casual list of drawbacks: $S(\rho)$ may be negative, may be unbounded both from below and above, and is scaling (hence coordinate transformation) dependent.

Anyway, a difference of two Shannon entropies, necessarily evaluated with respect to the same coordinate system, $S(\rho) - S(\rho')$ is known to quantify an absolute change in the information/randomness content when passing from $\rho$ to $\rho'$ and is obviously scaling independent. The same observation extends to the time derivative of the Shannon entropy in case of time-dependent probability densities.

Alternatively, although with reservations, one may pass to the familiar notion of the Kullback-Leibler entropy $\mathcal{K} = \int_\Gamma \rho\, (\ln \rho - \ln \rho')\, dx$, non-negative and scaling-independent from the outset. However, one should keep in mind that it is the conditional Kullback-Leibler (K-L) entropy $\mathcal{H}_c = -\mathcal{K}$ which is predominantly used in the literature as a justification, in terms of model systems, of the "entropy growth paradigm". The conditional K-L entropy takes non-positive values and its upper bound actually equals zero.

Let us point out that a consistent exploitation of the conditional K-L entropy is restricted either to the large time-scale phenomena, see e.g. Eq. (7), or to the dynamical systems which have an invariant density, see Eq. (9). In the short time-scale regimes and for systems without invariant densities, the conditional Kullback-Leibler entropy is not an adequate tool.

Let us consider

$$\rho_{\alpha,\beta} = \beta\, \rho[\beta(x - \alpha)] , \tag{10}$$

where $\alpha > 0$, $\beta > 0$ are real parameters. The respective Shannon entropy reads:

$$S(\rho_{\alpha,\beta}) = S(\rho) - \ln \beta . \tag{11}$$

For general probability distributions $\rho(x)$ with a fixed variance $\sigma^2$ we have $S(\rho) \leq \frac{1}{2} \ln(2\pi e \sigma^2)$ and $S(\rho)$ becomes maximized if and only if $\rho$ is a Gaussian. Therefore we can write

$$(2\pi e)^{-1/2} \exp[S(\rho_{\alpha,\beta})] \leq \sigma/\beta \tag{12}$$

and give a meaning to the β-scaling transformation of $\rho(x - \alpha)$: the density is broadened if $\beta < 1$ and shrinks if $\beta > 1$.
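
Relations (11) and (12) can be checked on a non-Gaussian density. The Python sketch below is an illustrative assumption of ours, not a construction taken from the text: it uses the unit Laplace density $\rho(x) = e^{-|x|}/2$ (variance 2), applies the transformation (10) with arbitrarily chosen $\alpha$ and $\beta$, and computes the differential entropies by trapezoidal quadrature.

```python
import math

def log_laplace(x):
    """Log of the unit Laplace density rho(x) = exp(-|x|) / 2, whose variance is 2."""
    return -abs(x) - math.log(2.0)

def log_scaled(x, alpha, beta):
    """Log of rho_{alpha,beta}(x) = beta * rho(beta * (x - alpha)), Eq. (10)."""
    return math.log(beta) + log_laplace(beta * (x - alpha))

def entropy(log_density, lo, hi, n=40000):
    """Differential entropy -int rho ln rho dx by the trapezoidal rule."""
    h = (hi - lo) / n
    def term(x):
        lp = log_density(x)
        return -math.exp(lp) * lp
    total = 0.5 * (term(lo) + term(hi))
    for k in range(1, n):
        total += term(lo + k * h)
    return total * h

alpha, beta = 1.5, 3.0      # illustrative shift and scaling parameters
sigma = math.sqrt(2.0)      # standard deviation of the unscaled Laplace density

s_ref = entropy(log_laplace, -50.0, 50.0)
s_scaled = entropy(lambda x: log_scaled(x, alpha, beta),
                   alpha - 50.0 / beta, alpha + 50.0 / beta)

print("S(rho)            =", s_ref)
print("S(rho_ab) + ln b  =", s_scaled + math.log(beta))
print("lhs of (12)       =", math.exp(s_scaled) / math.sqrt(2.0 * math.pi * math.e))
print("sigma / beta      =", sigma / beta)
```

Since the Laplace density is not Gaussian, the inequality (12) is expected to be strict here.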

Consider a one-parameter family of Gaussian densities $\rho_\alpha = \rho(x - \alpha)$, with the mean $\alpha \in R$ and the standard deviation fixed at $\sigma$. These densities share the very same value of Shannon entropy, independent of $\alpha$:

$$S_\sigma = \frac{1}{2} \ln(2\pi e \sigma^2) .$$

If we admit the standard deviation $\sigma$ to be another free parameter, a two-parameter family $\rho_\alpha \to \rho_{\alpha,\sigma}(x)$ appears. Then:

$$S_{\sigma'} - S_\sigma = \ln(\sigma'/\sigma) .$$

By denoting $\sigma = \sigma(t) = \sqrt{2Dt}$ and $\sigma' = \sigma(t')$ we make the non-stationary (heat kernel) density amenable to the "absolute comparison" formula at different time instants $t' > t > 0$: $(\sigma'/\sigma) = \sqrt{t'/t}$. Indeed, a fundamental solution of the heat equation $\partial_t \rho = D\Delta\rho$ reads

$$\rho(x,t) = \frac{1}{(4\pi Dt)^{1/2}} \exp\left(-\frac{x^2}{4Dt}\right) \tag{13}$$

whose differential entropy equals $S(t) = (1/2) \ln(4\pi e Dt)$, or, in the dimensionless form, $(1/2) \ln(4\pi e Dt/[x]^2)$, where $[x]$ is any dimensional unit with the SI dimension of length.

Let $\rho_\upsilon$ denote a convolution of a probability density $\rho$ with a Gaussian probability density having variance $\upsilon$. The transition density (heat kernel) of the Wiener process generates such a convolution for any $\rho_0(x)$, with $\upsilon = \sigma^2 = 2Dt$. Then (de Bruijn) we have the entropy accumulation formula:

$$\frac{dS}{dt} = D \int \frac{(\nabla\rho)^2}{\rho}\, dx > 0 .$$

The monotonic growth of $S(t)$ is paralleled by the linear-in-time growth of the variance $\sigma^2(t)$, hence quantifies the uncertainty (disorder) increase related to the "flattening" down of $\rho$.
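
For the heat kernel (13) both sides of the entropy accumulation formula are available in closed form, which makes a numerical cross-check straightforward. The Python sketch below is an illustration under our own assumptions (the value of $D$, the time instant and the integration window are arbitrary); it compares the quadrature value of $S(t)$ with $(1/2)\ln(4\pi e Dt)$ and the de Bruijn rate with $1/2t$.

```python
import math

D = 0.7  # diffusion coefficient, an arbitrary illustrative value

def log_kernel(x, t):
    """Log of the heat kernel (4 pi D t)^(-1/2) exp(-x^2 / 4 D t), Eq. (13)."""
    return -x * x / (4.0 * D * t) - 0.5 * math.log(4.0 * math.pi * D * t)

def integrate(func, lo, hi, n=20000):
    """Composite trapezoidal rule."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

def entropy_numeric(t, half_width=60.0):
    """-int rho ln rho dx evaluated by quadrature."""
    return integrate(lambda x: -math.exp(log_kernel(x, t)) * log_kernel(x, t),
                     -half_width, half_width)

def entropy_exact(t):
    """Closed form S(t) = (1/2) ln(4 pi e D t)."""
    return 0.5 * math.log(4.0 * math.pi * math.e * D * t)

def de_bruijn_rate(t, half_width=60.0):
    """D * int (d rho/dx)^2 / rho dx, using d rho/dx = -x rho / (2 D t)."""
    return D * integrate(lambda x: math.exp(log_kernel(x, t)) * (x / (2.0 * D * t)) ** 2,
                         -half_width, half_width)

t = 2.0
print("S numeric, S exact :", entropy_numeric(t), entropy_exact(t))
print("de Bruijn rate     :", de_bruijn_rate(t), " vs 1/(2t) =", 1.0 / (2.0 * t))
```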

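Let us consider the Kullback entropy $\mathcal{K}(\theta, \theta')$ for a family of probability densities $\rho_\theta$ labelled by a parameter (one or more) $\theta$, so that the "distance" between any two densities in this family can be directly evaluated. We take $\rho_{\theta'}$ as the reference probability density. Then:

$$\mathcal{K}(\theta, \theta') = \mathcal{K}(\rho_\theta | \rho_{\theta'}) = \int \rho_\theta(x) \ln \frac{\rho_\theta(x)}{\rho_{\theta'}(x)}\, dx . \tag{14}$$

It is particularly instructive to evaluate various K-L "distances" among members of a two-parameter family of $L^1(R)$-normalized Gaussian functions, labelled by independent parameters $\theta_1 = \alpha$ and $\theta_2 = \sigma$ (alternatively $\theta_2 = \sigma^2$) such that $\theta = (\theta_1, \theta_2)$. In the self-explanatory notation, for two Gaussian densities with different $\theta$ and $\theta'$ there holds:

$$\mathcal{K}(\theta, \theta') = \ln \frac{\sigma'}{\sigma} + \frac{1}{2}\left(\frac{\sigma^2}{\sigma'^2} - 1\right) + \frac{1}{2\sigma'^2}(\alpha - \alpha')^2 . \tag{15}$$

We may assume that $\theta'$ deviates very little from $\theta$: $\theta' = \theta + \Delta\theta$. Then, we have

$$\mathcal{K}(\theta, \theta + \Delta\theta) \simeq \frac{1}{2} \sum_{i,j} \mathcal{F}_{ij}\, \Delta\theta_i \Delta\theta_j \tag{16}$$

where $i, j = 1, 2$ and the Fisher information matrix $\mathcal{F}_{ij}$ has the form

$$\mathcal{F}_{ij} = \int \rho_\theta\, \frac{\partial \ln \rho_\theta}{\partial \theta_i}\, \frac{\partial \ln \rho_\theta}{\partial \theta_j}\, dx . \tag{17}$$

In case of Gaussian densities, labelled by independent $\theta_1 = \alpha$, $\theta_2 = \sigma$ (or $\theta_2 = \sigma^2$), the Fisher matrix is diagonal.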

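Let us set $\alpha' = \alpha$ and consider $\sigma^2 = 2Dt$, $\Delta(\sigma^2) = 2D\Delta t$. Then $S(\sigma'^2) - S(\sigma^2) \simeq \Delta t/2t$, while $\mathcal{K}(\theta, \theta') \simeq (\Delta t)^2/4t^2$. Although for finite increments $\Delta t$ we have

$$\frac{S(\sigma'^2) - S(\sigma^2)}{\Delta t} \simeq \frac{\sqrt{\mathcal{K}(\theta, \theta')}}{\Delta t} \simeq \frac{1}{2t} ,$$

the time derivative notion $\dot{S}$ surely can be defined for the differential entropy, but is definitely meaningless in terms of the corresponding short time-scale Kullback "distance", c.f. [10, 11].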
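
The short-time estimates quoted above follow from Eqs. (15) and (16) and can be reproduced with a few lines of Python. The sketch is illustrative: the parametrization $\theta = (\alpha, \sigma^2)$, for which the Gaussian Fisher matrix is $\mathrm{diag}(1/\sigma^2, 1/2\sigma^4)$, and the numerical values of $D$, $t$ and $\Delta t$ are our own choices.

```python
import math

def kl_gauss(alpha, var, alpha_ref, var_ref):
    """Eq. (15) written in terms of variances; (alpha_ref, var_ref) label the reference density."""
    return (0.5 * math.log(var_ref / var)
            + 0.5 * (var / var_ref - 1.0)
            + (alpha - alpha_ref) ** 2 / (2.0 * var_ref))

def kl_fisher(var, d_alpha, d_var):
    """Quadratic form (16) for theta = (alpha, sigma^2), with F = diag(1/var, 1/(2 var^2))."""
    return 0.5 * (d_alpha ** 2 / var + d_var ** 2 / (2.0 * var ** 2))

def entropy_gauss(var):
    """Differential entropy (1/2) ln(2 pi e sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

D, t, dt = 0.5, 1.0, 1e-3                      # illustrative values
var, var_next = 2.0 * D * t, 2.0 * D * (t + dt)

k_exact = kl_gauss(0.0, var, 0.0, var_next)    # common mean, alpha' = alpha
k_quad = kl_fisher(var, 0.0, var_next - var)
ds = entropy_gauss(var_next) - entropy_gauss(var)

print("K exact, K quadratic, (dt)^2/4t^2:", k_exact, k_quad, dt ** 2 / (4.0 * t ** 2))
print("S' - S, dt/2t                    :", ds, dt / (2.0 * t))
print("sqrt(K)/dt, 1/2t                 :", math.sqrt(k_exact) / dt, 1.0 / (2.0 * t))
```

The entropy increment is of the first order in $\Delta t$, whereas the K-L "distance" is of the second order, which is why only the former yields a finite time derivative.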

We stress that no such obstacle arises in the standard cautious use of the conditional Kullback entropy $\mathcal{H}_c$, when an invariant density is at hand. Indeed, normally one of the involved densities is the stationary (reference) one $\rho_{\theta'}(x) = \rho_*(x)$, while another is allowed to evolve in time $\rho_\theta(x) = \rho(x,t)$, $t \in R^+$, thence $\mathcal{H}_c(t) = -\mathcal{K}(\rho_t | \rho_*)$ and $d\mathcal{H}_c(t)/dt$ does make sense.

We recall that for the free Brownian motion there is no invariant density. As we have indicated before, Eq. (7), $\mathcal{H}_c(\rho_t, \rho_{t'})$, $t < t'$, still remains a useful tool, albeit in the asymptotic regime and for not too small values of $t' - t$.

3 Physical units in the entropy definition

Let us come back to the issue of physical units in the definition of a differential entropy. In fact, if $x$ and $p$ stand for one-dimensional phase space labels and $f(x,p)$ is a normalized phase-space density, $\int f(x,p)\, dx\, dp = 1$, then the related dimensionless differential entropy reads as follows [4]:

$$S_h = -\int (hf) \ln(hf)\, \frac{dx\, dp}{h} = -\int f \ln(hf)\, dx\, dp \tag{18}$$

where $h = 2\pi\hbar$ is the tentatively accepted (there is no other mention of quantum theory) Planck constant. Let $\rho(x)$ and $\tilde{\rho}_h(p)$ be two independent, respectively spatial and momentum space densities. We form the joint density

$$f(x,p) = \rho(x)\, \tilde{\rho}_h(p) \tag{19}$$

and evaluate the differential entropy $S_h$ for this density. Remembering that $\int \rho(x)\, dx = 1 = \int \tilde{\rho}_h(p)\, dp$, we have formally:

$$S_h = -\int \rho \ln \rho\, dx - \int \tilde{\rho}_h \ln \tilde{\rho}_h\, dp - \ln h = S^x + S^p - \ln h . \tag{20}$$

The formal use of the logarithm properties before executing integrations in $\int \tilde{\rho}_h \ln(h \tilde{\rho}_h)\, dp$ has left us with an issue of "literally taking the logarithm of a dimensional argument", i.e. that of $\ln h$.

We recall that $S_h$ is a dimensionless quantity, while if $x$ has dimensions of length, then the probability density has dimensions of inverse length, and analogously in connection with momentum dimensions.

Let us denote $x = r\, \delta x$ and $p = \tilde{r}\, \delta p$ where labels $r$ and $\tilde{r}$ are dimensionless, while $\delta x$ and $\delta p$ stand for respective position and momentum dimensional (hitherto: resolution) units. Then:

$$-\int \rho \ln \rho\, dx - \ln(\delta x) = -\int \rho \ln(\delta x\, \rho)\, dx \tag{21}$$

is a dimensionless quantity. Analogously

$$-\int \tilde{\rho}_h \ln \tilde{\rho}_h\, dp - \ln \delta p = -\int \tilde{\rho}_h \ln(\delta p\, \tilde{\rho}_h)\, dp \tag{22}$$

is dimensionless. The first left-hand-side terms in the two equations above we recognize as $S^x$ and $S^p$ respectively. Hence, formally we have arrived at a manifestly dimensionless decomposition

$$S_h = -\int \rho \ln(\delta x\, \rho)\, dx - \int \tilde{\rho}_h \ln(\delta p\, \tilde{\rho}_h)\, dp + \ln \frac{\delta x\, \delta p}{h} = S^x_{\delta x} + S^p_{\delta p} + \ln \frac{\delta x\, \delta p}{h} \tag{23}$$

instead of the previous one, Eq. (20). The last identity, Eq. (23), gives an unambiguous meaning to the preceding formal manipulations with dimensional quantities. Instead of the Planck constant $h$ we can use any other unit with SI dimensions of action, say $\delta h$.

As a byproduct of our discussion, we have resolved the case of the spatially interpreted real axis, when $x$ has dimensions of length, c.f. also [4]: $S^x_{\delta x} = -\int \rho \ln(\delta x\, \rho)\, dx$ is the pertinent dimensionless differential entropy definition for spatial probability densities.

Example 1: Let us discuss an explicit example involving the Gauss density

$$\rho(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp[-(x - x_0)^2/2\sigma^2] \tag{24}$$

where $\sigma$ is the standard deviation (its square stands for the variance). There holds $S(\rho) = \frac{1}{2} \ln(2\pi e \sigma^2)$, which is a dimensionless outcome. If we pass to $x$ with dimensions of length, then inevitably $\sigma$ must have dimensions of length. It is instructive to check that in this dimensional case we have a correct dimensionless result:

$$S^x_{\delta x} = \frac{1}{2} \ln\left(2\pi e\, \frac{\sigma^2}{(\delta x)^2}\right) , \tag{25}$$

to be compared with Eq. (21). Clearly, $S^x_{\delta x}$ vanishes if $\sigma/\delta x = (2\pi e)^{-1/2}$, hence at the dimensional value of the standard deviation $\sigma = (2\pi e)^{-1/2}\, \delta x$.
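
The role of the resolution unit $\delta x$ in Example 1 can be made tangible by expressing one and the same Gaussian in two systems of units. The Python sketch below is an illustration with made-up numbers (a width of 2 mm and a resolution unit of 0.5 mm); the function names are ours.

```python
import math

def naive_entropy(sigma_value):
    """(1/2) ln(2 pi e sigma^2) evaluated on the bare number sigma_value."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma_value ** 2)

def dimensionless_entropy(sigma_value, dx_value):
    """Eq. (25): (1/2) ln(2 pi e sigma^2 / dx^2); both arguments in the same unit."""
    return 0.5 * math.log(2.0 * math.pi * math.e * (sigma_value / dx_value) ** 2)

# The same physical width and resolution unit, written in metres and in millimetres.
sigma_m, dx_m = 2.0e-3, 0.5e-3
sigma_mm, dx_mm = 2.0, 0.5

print("naive, metres / millimetres        :", naive_entropy(sigma_m), naive_entropy(sigma_mm))
print("dimensionless, metres / millimetres:",
      dimensionless_entropy(sigma_m, dx_m), dimensionless_entropy(sigma_mm, dx_mm))

# Width at which the dimensionless entropy vanishes: sigma = (2 pi e)^(-1/2) dx.
sigma_zero = dx_mm / math.sqrt(2.0 * math.pi * math.e)
print("entropy at sigma_zero              :", dimensionless_entropy(sigma_zero, dx_mm))
```

The naive values differ by $\ln 1000$ between the two unit systems, while the values computed with an explicit $\delta x$ coincide.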

Example 2: Let us invoke the simplest (naive) text-book version of the Boltzmann H-theorem, valid in case of the rarefied gas (of mass m particles), without external forces, close to its thermal equilibrium, under an assumption of its space homogeneity. If the probability density function $f(v)$ is a solution of the corresponding Boltzmann kinetic equation, then the Boltzmann H-functional (which is simply the negative of the differential entropy) $H(t) = \int f(v) \ln f(v)\, dv$ does not increase: $\frac{d}{dt} H(t) \leq 0$. In the present case we know that there exists an invariant (asymptotic) density, which in the one-dimensional case has the form $f_*(v) = (m/2\pi k_B T)^{1/2} \exp[-m(v - v_0)^2/2k_B T]$. $H(t)$ is known to be time-independent only if $f = f_*(v)$. We can straightforwardly evaluate $H_* = \int f_* \ln f_*\, dv = -(1/2) \ln(2\pi e k_B T/m)$ and become faced with an apparent dimensional difficulty: an argument of the logarithm is not dimensionless. For sure, a consistent integration outcome for $H(t)$ should involve $k_B T/m[v]^2$ instead of $k_B T/m$, provided $[v]$ stands for any unit of velocity. An example is $[v] = 1$ m/s (here m stands for the SI length unit, and not for a mass parameter), or any power of ten times that unit. To this end it suffices to redefine $H_*$ as follows:

$$H_* \to H_*^{[v]} = \int f_* \ln([v] \cdot f_*)\, dv . \tag{26}$$

Multiplying $f_*$ by $[v]$ we arrive at the dimensionless argument of the logarithm in the above and cure the dimensional obstacle.

We recall that under the scaling transformation Eq. (10) the respective Shannon entropy takes the form $S(\rho_{\alpha,\beta}) = S(\rho) - \ln \beta$. In case of Gaussian $\rho$, we get $S(\rho_{\alpha,\beta}) = \ln[(\sigma/\beta) \sqrt{2\pi e}]$. Clearly, $S(\rho_{\alpha,\beta})$ takes the value 0 at $\sigma = (2\pi e)^{-1/2} \beta$, in analogy with our previous dimensional considerations. If an argument of $\rho$ is assumed to have dimensions, then the scaling transformation with the dimensional $\beta$ may be interpreted as a method to restore the dimensionless differential entropy value.

4 Temporal behavior of entropies 

4.1 Deterministic system 

Let us consider a classical dynamical system in $R^n$ whose evolution is governed by equations of motion:

$$\dot{x} = f(x) \tag{27}$$

where $\dot{x}$ stands for the time derivative and $f$ is an $R^n$-valued function of $x \in R^n$, $x = \{x_1, x_2, ..., x_n\}$. A statistical ensemble of solutions of such dynamical equations can be described by a time-dependent probability density $\rho(x,t)$ whose dynamics is given by the generalized Liouville (in fact, continuity) equation

$$\partial_t \rho = -\nabla \cdot (f\rho) \tag{28}$$

where $\nabla = \{\partial/\partial x_1, ..., \partial/\partial x_n\}$.

With a continuous probability density $\rho = \rho(x,t)$, where $x \in R^n$ and we allow for an explicit time-dependence, we associate a respective differential entropy functional $S(\rho)$, where in general $S(\rho) = S(t)$ depends on time.

Let us take for granted that an interchange of the time derivative with the integral is allowed (suitable precautions are necessary with respect to the convergence of integrals). Then, we readily get an identity:

$$\dot{S} = \int \rho\, (\mathrm{div} f)\, dx = \langle \nabla \cdot f \rangle . \tag{29}$$

Accordingly, the information entropy $S(t)$ grows with time only if the dynamical system has positive mean flow divergence.

However, in general $\dot{S}$ is not positive definite. For example, dissipative dynamical systems are characterized by the negative (mean) flow divergence. Fairly often, the divergence of the flow is constant. Then, an "amount of information" carried by a corresponding statistical ensemble (e.g. its density) increases, which is paralleled by the information entropy decay (decrease).

An example of a system with a point attractor (sink) at the origin is the one-dimensional non-Hamiltonian system $\dot{x} = -x$. In this case $\mathrm{div} f = -1$ and $\dot{S} = -1$. Further discussion of dynamical systems with strange (multifractal) attractors, for which the Shannon information (differential) entropy decreases indefinitely (the pertinent steady states are no longer represented by probability density functions), can be found in [12]. We note that for Hamiltonian systems the phase-space flow has vanishing divergence, hence $\dot{S} = 0$, which implies that "information is conserved" in Hamiltonian dynamics.

Let there be given an invertible dynamical system on $R^2$, with $f(x) = Fx$, where $F$ is a two-by-two real matrix and $x \in R^2$. A solution has the form $x(t) = \exp(tF)\, x(0)$, where the matrix operator $\exp(tF)$ is defined through the standard Taylor expansion formula. The solution of the Liouville equation with an initial probability density $f_0(x)$ is given by

$$f(x,t) = \exp[-(\mathrm{tr} F)t] \cdot f_0(\exp(-tF)x) \tag{30}$$

and hence:

$$S(f_t) = S(f_0) + (\mathrm{tr} F)\, t \implies \dot{S}(f_t) = \mathrm{tr} F . \tag{31}$$

Obviously $\mathrm{tr} F = \lambda_1 + \lambda_2$, where $\lambda_i$, $i = 1, 2$, are the eigenvalues of $F$. We realize that $S(f_t)$ grows indefinitely if $\mathrm{tr} F > 0$ and diminishes indefinitely towards $-\infty$ if $\mathrm{tr} F < 0$. There is no stationary density and the conditional entropy is not defined.
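
Eq. (31) is easy to verify for a Gaussian initial density, which under the linear flow (30) stays Gaussian with covariance $\exp(tF)\, C_0 \exp(tF)^T$. The Python sketch below is an illustration under this assumption; the matrix $F$, the initial covariance $C_0$ and the small matrix-exponential routine are our own choices and are not taken from the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def expm(a, t, squarings=8, terms=20):
    """exp(t * a) for a 2x2 matrix: Taylor series on a scaled matrix, then repeated squaring."""
    scale = t / (2 ** squarings)
    small = [[a[i][j] * scale for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms + 1):
        term = matmul(term, small)
        term = [[term[i][j] / n for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def gaussian_entropy(cov):
    """Differential entropy of a two-dimensional Gaussian with covariance matrix cov."""
    return math.log(2.0 * math.pi * math.e) + 0.5 * math.log(det(cov))

def entropy_at(F, cov0, t):
    """Entropy of the Gaussian ensemble transported by the flow x(t) = exp(tF) x(0)."""
    E = expm(F, t)
    return gaussian_entropy(matmul(matmul(E, cov0), transpose(E)))

F = [[0.0, 1.0], [-2.0, -0.5]]      # illustrative dissipative matrix, tr F = -0.5
cov0 = [[1.0, 0.2], [0.2, 0.5]]     # illustrative initial covariance
trace = F[0][0] + F[1][1]

for t in (0.5, 1.0, 3.0):
    print(t, entropy_at(F, cov0, t) - entropy_at(F, cov0, 0.0), trace * t)
```

The two printed columns should agree, i.e. the entropy changes linearly in time with the slope $\mathrm{tr} F$.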

4.2 Random system

In case of a general dissipative dynamical system, a controlled admixture of noise can stabilize dynamics and yield asymptotic invariant densities. For example, an additive modification of the right-hand-side of Eq. (27) by a white noise term $A(t)$, where $\langle A_i(s) \rangle = 0$ and $\langle A_i(s) A_j(s') \rangle = 2q\, \delta(s - s')\, \delta_{ij}$, $i = 1, 2, ..., n$, implies the Fokker-Planck-Kramers equation:

$$\partial_t \rho = -\nabla \cdot (f\rho) + q\Delta\rho \tag{32}$$

where $\Delta = \nabla^2 = \sum_i \partial^2/\partial x_i^2$. Accordingly, the differential entropy dynamics would take another form than that defined by Eq. (29):

$$\dot{S} = \int \rho\, (\mathrm{div} f)\, dx + q \int \frac{1}{\rho} \sum_i \left(\frac{\partial \rho}{\partial x_i}\right)^2 dx . \tag{33}$$

Now, the dissipative term $\langle \nabla \cdot f \rangle < 0$ can be counterbalanced by a strictly positive stabilizing contribution $q \sum_i \int \frac{1}{\rho} \left(\frac{\partial \rho}{\partial x_i}\right)^2 dx$. This allows one to expect that, under suitable circumstances, dissipative systems with noise may yield $\dot{S} = 0$. If $\langle \nabla \cdot f \rangle > 0$, then the differential (information) entropy would grow monotonically.

We shall discuss an example of a non-invertible system, provided by the standard one-dimensional Ornstein-Uhlenbeck process [10, 11]. We choose the forward drift $b = -\gamma x$, so that the Fokker-Planck equation reads $\partial_t \rho = D\Delta\rho + \nabla[(\gamma x)\rho]$, with $\gamma > 0$ and $D > 0$ being the diffusion coefficient.

If an initial density is chosen in the Gaussian form, with the mean value $\alpha_0$ and variance $\sigma_0^2$, the Fokker-Planck evolution preserves the Gaussian form of $\rho(x,t)$ while modifying the mean value $\alpha(t) = \alpha_0 \exp(-\gamma t)$ and variance:

$$\sigma^2(t) = \sigma_0^2 \exp(-2\gamma t) + \frac{D}{\gamma} [1 - \exp(-2\gamma t)] . \tag{34}$$

Accordingly, since a unique invariant density has the form $\rho_* = \sqrt{\gamma/2\pi D}\, \exp(-\gamma x^2/2D)$, we obtain:

$$\mathcal{H}_c(t) = \exp(-2\gamma t)\, \mathcal{H}_c(\rho_0, \rho_*) = -\frac{\gamma \alpha_0^2}{2D} \exp(-2\gamma t) \tag{35}$$

i.e. a monotonic growth of the negative-valued conditional Kullback-Leibler entropy towards its maximum at zero:

$$\dot{\mathcal{H}}_c(t) = -2\gamma \exp(-2\gamma t)\, \mathcal{H}_c(\rho_0, \rho_*) = \frac{\gamma^2 \alpha_0^2}{D} \exp(-2\gamma t) \geq 0 . \tag{36}$$

The differential entropy

$$S(t) = (1/2) \ln[2\pi e \sigma^2(t)] \tag{37}$$

shows another temporal behavior:

$$\dot{S} = \frac{1}{2\sigma^2(t)} \frac{d\sigma^2(t)}{dt} = \frac{2\gamma (D - \gamma\sigma_0^2) \exp(-2\gamma t)}{2\gamma\, \sigma^2(t)} . \tag{38}$$

We observe that if $\sigma_0^2 > D/\gamma$, then $\dot{S} < 0$, while $\sigma_0^2 < D/\gamma$ implies $\dot{S} > 0$.

In both cases the behavior of the differential entropy is monotonic, although its growth or decay critically relies on the choice of $\sigma_0^2$. Irrespective of $\sigma_0^2$ the asymptotic value of $S(t)$ as $t \to \infty$ reads $(1/2) \ln[2\pi e (D/\gamma)]$. It is useful to note that in the special case of $\sigma_0^2 = D/\gamma$ the differential entropy is a constant of motion, while the conditional K-L entropy nonetheless does grow, asymptotically approaching the value zero according to Eq. (36).

Summarizing, we can say that the conditional Kullback-Leibler entropy of the Ornstein-Uhlenbeck process grows monotonically in time, while the temporal behavior of the Gibbs-Shannon (differential) entropy depends on statistical properties (half-width $\sigma_0$) of the initial ensemble density. This pattern of temporal behavior appears to be generic to a large class of dynamical systems.
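
The contrast between the two entropies can be tabulated directly from Eqs. (34) and (37), with the conditional K-L entropy evaluated from the Gaussian formula (15) against the invariant density of variance $D/\gamma$. The Python sketch below is illustrative: the parameter values are arbitrary, and the purely exponential form (35) is checked only for the case $\sigma_0^2 = D/\gamma$, where it follows at once from Eq. (15).

```python
import math

def ou_mean(t, alpha0, gamma):
    """Mean value alpha(t) = alpha0 exp(-gamma t)."""
    return alpha0 * math.exp(-gamma * t)

def ou_variance(t, var0, gamma, D):
    """Variance sigma^2(t), Eq. (34)."""
    decay = math.exp(-2.0 * gamma * t)
    return var0 * decay + (D / gamma) * (1.0 - decay)

def gibbs_shannon(t, var0, gamma, D):
    """Differential entropy S(t), Eq. (37)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * ou_variance(t, var0, gamma, D))

def conditional_kl(t, alpha0, var0, gamma, D):
    """H_c(t) = -K(rho_t | rho_*), via Eq. (15) with reference variance D / gamma."""
    var_star = D / gamma
    var = ou_variance(t, var0, gamma, D)
    mean = ou_mean(t, alpha0, gamma)
    k = (0.5 * math.log(var_star / var)
         + 0.5 * (var / var_star - 1.0)
         + mean ** 2 / (2.0 * var_star))
    return -k

gamma, D, alpha0 = 1.0, 0.5, 2.0                 # illustrative parameters
times = [0.1 * k for k in range(51)]

for var0 in (0.1, D / gamma, 2.0):               # below, at and above D / gamma
    s = [gibbs_shannon(t, var0, gamma, D) for t in times]
    hc = [conditional_kl(t, alpha0, var0, gamma, D) for t in times]
    print(f"var0 = {var0}: S from {s[0]:.4f} to {s[-1]:.4f}, "
          f"H_c from {hc[0]:.4f} to {hc[-1]:.6f}")
```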

To find out whether there is anything deeper in the above apparent differences in the temporal behavior of the Gibbs-Shannon and Kullback-Leibler entropies associated with the same time-dependent probability density, except for the a priori presumed existence of the reference invariant density, let us consider the one-dimensional Fokker-Planck equation for any Smoluchowski process. We assume

$$\partial_t \rho = D\Delta\rho - \nabla(b\rho) \tag{39}$$

with a forward drift $b = b(x,t)$ of the gradient form $b = -\nabla\Phi$ and attribute to the diffusion coefficient $D$ the dimensions of $\hbar/2m$ or $k_B T/m\beta$.

Furthermore, we introduce the velocity fields: $u(x,t) = D\nabla \ln \rho(x,t)$ and $v(x,t) = b(x,t) - u(x,t)$. The current velocity $v(x,t)$, in view of $\partial_t \rho = -\nabla(v\rho)$ which is an equivalent form of Eq. (39), contributes to the diffusion current $j = v\rho$.

For the differential entropy $S(t) = -\int \rho(x,t) \ln \rho(x,t)\, dx$, while imposing boundary restrictions that $\rho$, $v\rho$, $b\rho$ vanish at spatial infinities or finite interval borders, we readily get the entropy balance equation of the form Eq. (33), with the minor modification, i.e. the replacement of $q$ by $D$. We are however interested in its equivalent form (easily derivable under previously listed boundary restrictions) [10, 11]:

$$D\dot{S} = \langle v^2 \rangle - \langle b \cdot v \rangle . \tag{40}$$

Remembering that we deal with the Smoluchowski process, we set (adjusting dimensional constants) $b = (D/k_B T)\, F$. Exploiting $j = v\rho$ and demanding $F = -\nabla V$ we infer:

$$\dot{S} = (1/D) \langle v^2 \rangle - Q \tag{41}$$

where the first (positive) term on the right-hand-side stands for the differential entropy accumulation rate (entropy gain by the system).

The second term contains the $Q$ entry:

$$Q = (1/k_B T) \int F \cdot j\, dx = (1/D) \langle b \cdot v \rangle \tag{42}$$

which, if positive ($Q > 0$ is not a must), allows one to interpret $-Q$ as the entropy dissipation rate, i.e. an entropy transfer to the environment in the form of the surplus heat. Note that $k_B T Q = \int F \cdot j\, dx$ has a conspicuous form of the fairly standard power release expression, i.e. the time rate at which the mechanical work per unit of mass is returned back to the thermal reservoir (or absorbed if $Q < 0$) in the form of heat.

Under current premises, there exists a stationary solution of the Fokker-Planck equation

$$\rho_*(x) = \frac{1}{Z} \exp\left(-\frac{V(x)}{k_B T}\right) \tag{43}$$

where $Z = \int \exp(-V(x)/k_B T)\, dx$.

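Let us take $\rho_*(x)$ as a reference density with respect to which the divergence of $\rho(x,t)$ is quantified in terms of the conditional K-L entropy. Then:

$$\mathcal{H}_c(t) = -\int \rho \ln\left(\frac{\rho}{\rho_*}\right) dx = S(t) - \ln Z - \frac{\langle V \rangle}{k_B T} \tag{44}$$

and straightforwardly, because of

$$\frac{d}{dt} \langle V \rangle = -k_B T\, Q , \tag{45}$$

we arrive at

$$\dot{\mathcal{H}}_c = \dot{S} + Q \geq 0 . \tag{46}$$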
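
The balance relations (41), (42) and (46) can be tested on the Ornstein-Uhlenbeck process, for which all averages are elementary. The Python sketch below is an illustration under stated assumptions: the density is Gaussian with mean $\alpha(t)$ and variance $\sigma^2(t)$ from Eq. (34), so that $u = -D(x - \alpha)/\sigma^2$ and $v = b - u$ with $b = -\gamma x$; the averages $\langle v^2 \rangle$ and $\langle b \cdot v \rangle$ coded below were worked out by us for this Gaussian case, and time derivatives are taken by central differences.

```python
import math

gamma, D = 1.0, 0.5   # illustrative drift and diffusion parameters

def moments(t, alpha0, var0):
    """Mean and variance of the Gaussian Ornstein-Uhlenbeck density, Eq. (34)."""
    decay = math.exp(-2.0 * gamma * t)
    return alpha0 * math.exp(-gamma * t), var0 * decay + (D / gamma) * (1.0 - decay)

def entropy(t, alpha0, var0):
    """Differential entropy S(t), Eq. (37)."""
    _, var = moments(t, alpha0, var0)
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def conditional_kl(t, alpha0, var0):
    """H_c(t) = -K(rho_t | rho_*), with the invariant variance D / gamma."""
    mean, var = moments(t, alpha0, var0)
    var_star = D / gamma
    return -(0.5 * math.log(var_star / var) + 0.5 * (var / var_star - 1.0)
             + mean ** 2 / (2.0 * var_star))

def mean_v_squared(t, alpha0, var0):
    """<v^2> for v = b - u, b = -gamma x, u = -D (x - mean) / var (Gaussian average)."""
    mean, var = moments(t, alpha0, var0)
    return (gamma * mean) ** 2 + (D / var - gamma) ** 2 * var

def heat_rate(t, alpha0, var0):
    """Q = <b v> / D, Eq. (42), evaluated for the same Gaussian density."""
    mean, var = moments(t, alpha0, var0)
    return ((gamma * mean) ** 2 - gamma * (D - gamma * var)) / D

def derivative(func, t, h=1e-5):
    """Central finite difference."""
    return (func(t + h) - func(t - h)) / (2.0 * h)

t, alpha0, var0 = 0.3, 2.0, 0.1
dHc = derivative(lambda s: conditional_kl(s, alpha0, var0), t)
dS = derivative(lambda s: entropy(s, alpha0, var0), t)
print("dH_c/dt       :", dHc)
print("<v^2>/D       :", mean_v_squared(t, alpha0, var0) / D)
print("dS/dt + Q     :", dS + heat_rate(t, alpha0, var0))
```

All three printed numbers should coincide up to the finite-difference error.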

At this point, we can come back to a continued discussion of the Ornstein-Uhlenbeck process. Namely, we have here a direct control of the behavior of the "power release" expression $Q = \dot{\mathcal{H}}_c - \dot{S}$. Since

$$\dot{\mathcal{H}}_c = (\gamma^2 \alpha_0^2/D) \exp(-2\gamma t) \geq 0 , \tag{47}$$

in case of $\dot{S} < 0$ we encounter a continual power supply $Q > 0$ by the thermal environment (alternatively, power absorption by the system).

In case of $\dot{S} > 0$ the situation is more complicated. For example, if $\alpha_0 = 0$, we can easily check that $Q < 0$, i.e. we have the power drainage from the environment for all $t \in R^+$. More generally, the sign of $Q$ is negative for $\alpha_0^2 < 2(D - \gamma\sigma_0^2)/\gamma$. If the latter inequality is reversed, the sign of $Q$ is not uniquely specified and suffers a change at a suitable time instant $t_{change}(\alpha_0^2, \sigma_0^2)$.

Interestingly enough, in the special case of $\sigma_0^2 = D/\gamma$, i.e. $\dot{S} = 0$, we encounter

$$\dot{\mathcal{H}}_c = Q \geq 0 \tag{48}$$

i.e. a direct connection between the entropy increase and heat removal (to the thermostat) time rates, which counterbalance each other.

4.3 Phase-space dynamics 

One may argue that the rather unexpected insight reported above into the nontrivial power transfer processes is an artifact of the one-dimensional spatial (Smoluchowski) projection of the phase-space motion. Let us therefore indicate arguments to the contrary.

For Hamiltonian systems the phase-space flow is divergence-less. Indeed, let us consider a two-dimensional conservative system $\dot{x} = p/m$ and $\dot{p} = -\nabla V$ where $H = p^2/2m + V(x)$. Obviously, $\mathrm{div} f = 0$, which implies $\dot{S} = 0$. In particular this extends to the standard harmonic oscillator with $V(x) = (m\omega^2/2) x^2$.

For the harmonic oscillator with friction, $\dot{x} = v$, $\dot{v} = -(\gamma/m) v - (\omega^2/m) x$, we can adopt the observations of subsection 4.1 with the two-by-two matrix $F$, whose first row reads $(0, 1)$, while $(F)_{21} = -\omega^2/m$, $(F)_{22} = -\gamma/m$. Consequently $\mathrm{tr} F = -\gamma/m$.

A solution of the corresponding Liouville-type equation was discussed in subsection 4.1. The Gibbs-Shannon entropy evolves in time according to Eq. (31): $S(t) = S(0) - (\gamma t)/m$ and $S \to -\infty$ as $t \to \infty$. Since $\gamma > 0$, we have $\dot{S} = -\gamma/m < 0$. There is no stationary density and hence no $\mathcal{H}_c(t)$.

An admixture of noise in the velocity/momentum rate equation in the damped harmonic oscillator case allows for the existence of a stationary density. Let us consider an example of the noisy damped harmonic oscillator: $\dot{x} = p/m$, $\dot{p} = -(\gamma/m) p - \omega^2 x + \xi(t)$, where the white noise term $\xi$ is normalized as follows: $\langle \xi(t) \rangle = 0$, $\langle \xi(t) \xi(t') \rangle = a\, \delta(t - t')$. The corresponding Fokker-Planck-Kramers equation for the probability density $f(x,v)$, with $v = p/m$, is:

$$\frac{\partial f}{\partial t} + \frac{\partial (vf)}{\partial x} = \frac{1}{m} \frac{\partial [(\gamma v + \omega^2 x) f]}{\partial v} + \frac{a}{2m^2} \frac{\partial^2 f}{\partial v^2} \tag{49}$$

and has a unique stationary solution:

$$f_*(x,v) = \frac{\gamma\omega\sqrt{m}}{\pi a} \exp\left[-\frac{\gamma}{a} (\omega^2 x^2 + m v^2)\right] . \tag{50}$$

A detailed, in part computer-assisted, analysis of the temporal behavior of Gibbs-Shannon and condi- 
tional K-L entropies evaluated for density solutions of the above Kramers equation, with the initial data 

has been made in Ref. We shall summarize the outcomes of this investigation.

In all three basic regimes (overdamped $\gamma^2 > 4\omega^2$, critical $\gamma^2 = 4\omega^2$ and
underdamped $\gamma^2 < 4\omega^2$), the conditional Kullback-Leibler entropy quantifies an approach of
$f(x,v,t)$ towards $f_*(x,v)$ in terms of the monotonic growth pattern (this statement includes also
the case of $\dot{\mathcal{H}}_c(t) = 0$).

The situation is entirely different if we consider the Gibbs-Shannon entropy of $f(x,v,t)$. Let us
denote $\sigma_*^2 = q/2\gamma\omega^2$ and $\alpha_x = \sigma_x^2 - \sigma_*^2$,
$\alpha_v = \sigma_v^2 - \omega^2\sigma_*^2$. The behavior of $S(t)$ sensitively depends on the mutual
relations (signs, vanishing or non-vanishing of either or both, etc.) between $\alpha_x$ and $\alpha_v$,
and all details can be found in Ref.

In the overdamped and critical cases, five independent temporal behaviors are admitted. The first three
are of the monotonic type, since $\dot{S}$ is vanishing, positive or negative. The fourth one admits a
change of sign of $\dot{S}$ at a certain $t_0 > 0$ from positive to negative, plus the same scenario in
reverse. The fifth temporal scenario shows a passage through $\dot{S}$-positive, negative and again
positive stages of evolution, plus the reverse (negative, positive, negative) option.

The underdamped case shows even more intriguing patterns of temporal behavior. Namely, in addition
to the monotonic negative or positive signs of $\dot{S}$ we have also a conspicuous damped oscillation
of $S(t)$, where $\dot{S}$ changes sign indefinitely, but the amplitude of oscillations performed by
$S(t)$ continually diminishes.

All these diverse temporal patterns are special for the Gibbs-Shannon entropy. They are in turn
accompanied by a unique pattern of the strictly monotonic growth (or none), $\dot{\mathcal{H}}_c(t) \geq 0$,
which is displayed by the conditional Kullback-Leibler entropy.
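The contrast between the two entropies can be illustrated on Gaussian solutions of the Kramers equation, for which both entropies are determined by the covariance entries $(s_{xx}, s_{xv}, s_{vv})$. The following is a minimal sketch, assuming zero-mean Gaussian initial data and a noise intensity denoted $q$; the parameter values (an underdamped case with $m = 1$), the initial covariance and all function names are illustrative assumptions and do not reproduce the computations of the cited reference.

```python
import math

# Illustrative parameters (assumptions): underdamped, gamma^2 < 4 omega^2 with m = 1.
M, OMEGA2, GAMMA, Q = 1.0, 1.0, 0.1, 0.2
SX_STAR = Q / (2.0 * GAMMA * OMEGA2)  # stationary variance of x
SV_STAR = Q / (2.0 * GAMMA * M)       # stationary variance of v


def rhs(s):
    """Moment equations for (s_xx, s_xv, s_vv) of the noisy damped oscillator."""
    sxx, sxv, svv = s
    return (
        2.0 * sxv,
        svv - (GAMMA / M) * sxv - (OMEGA2 / M) * sxx,
        -2.0 * (GAMMA / M) * svv - 2.0 * (OMEGA2 / M) * sxv + Q / M ** 2,
    )


def rk4_step(s, dt):
    """One classical Runge-Kutta step for the covariance entries."""
    def shifted(k, c):
        return tuple(a + c * dt * b for a, b in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return tuple(a + dt * (p + 2.0 * q2 + 2.0 * r + w) / 6.0
                 for a, p, q2, r, w in zip(s, k1, k2, k3, k4))


def det(s):
    return s[0] * s[2] - s[1] ** 2


def entropy_rate(s):
    """dS/dt = -gamma/m + q s_xx / (2 m^2 det), obtained from the moment equations."""
    return -GAMMA / M + Q * s[0] / (2.0 * M ** 2 * det(s))


def conditional_entropy(s):
    """H_c = -KL(f || f_*) for zero-mean Gaussians; non-positive, zero at equilibrium."""
    trace_term = s[0] / SX_STAR + s[2] / SV_STAR
    return -0.5 * (trace_term - 2.0 + math.log(SX_STAR * SV_STAR / det(s)))


def run(s0=(4.0, 0.0, 0.25), t_end=30.0, dt=1e-3):
    """Integrate the moments; count sign changes of dS/dt and test monotonicity of H_c."""
    s = s0
    rates, hc = [entropy_rate(s)], [conditional_entropy(s)]
    for _ in range(round(t_end / dt)):
        s = rk4_step(s, dt)
        rates.append(entropy_rate(s))
        hc.append(conditional_entropy(s))
    sign_changes = sum(1 for a, b in zip(rates, rates[1:]) if a * b < 0)
    hc_monotone = all(b - a >= -1e-12 for a, b in zip(hc, hc[1:]))
    return sign_changes, hc_monotone, hc[0], hc[-1]


print(run())
```

For this parameter choice the Gibbs-Shannon entropy rate changes sign repeatedly, while the conditional entropy grows monotonically towards its maximal value zero.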

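In close analogy with our considerations pertaining to the nontrivial power transfers between an open
dynamical system and its thermal environment, cf. subsection 4.2, let us notice that the invariant density
Eq. (50) has the form analogous to that of Eq. (43), with $q/2\gamma$ playing the role of $k_B T$.
Indeed, we have:

$$f_*(x,v) = \frac{1}{Z}\exp\left[-\frac{2\gamma}{q} E_{cl}(p,x)\right] \qquad (52)$$

with $1/Z = \gamma\omega\sqrt{m}/(\pi q)$ and $E_{cl}(p,x) = p^2/2m + V(x)$, where $V(x) = \frac{1}{2}\omega^2 x^2$,
is the energy of a classical harmonic oscillator at the $(x, p = mv)$ phase-space point.
Accordingly, we have:

$$\mathcal{H}_c(t) = -\int f \ln\left(\frac{f}{f_*}\right) dx\,dv = S(t) - \ln Z - \frac{2\gamma}{q}\langle E_{cl} \rangle \qquad (53)$$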



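where $S(t) = -\int f \ln f \, dx\,dv$. Therefore, it is an intrinsic property of our dynamical system
that $\dot{\mathcal{H}}_c = \dot{S} + \dot{Q} \geq 0$, where we define

$$\dot{Q} = -\frac{2\gamma}{q}\frac{d}{dt}\langle E_{cl} \rangle \qquad (54)$$

with $\langle E_{cl} \rangle$ the mean oscillator energy, and clearly, $\dot{Q}$ is the direct analogue of
the previously introduced power/heat transfer rate in the mean, cf. subsection 4.2.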
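The non-negativity of the time rate of the conditional entropy can be made explicit by a short computation. The following is a sketch, assuming that $f$ decays fast enough at infinity for all boundary terms to vanish and that the noise intensity is denoted $q$, so that the velocity diffusion coefficient is $q/2m^2$. The shorthand $h = f/f_*$, the drift field $A = (v, -(\gamma v + \omega^2 x)/m)$ and the unit vector $e_v$ along the velocity axis are introduced here for illustration only. In this notation the Kramers equation reads $\partial_t f = -\nabla \cdot (A f) + (q/2m^2)\,\partial_v^2 f$, and $f_*$ annihilates its right-hand side.

```latex
\begin{align*}
\frac{d}{dt}\int f \ln h \, dx\,dv
  &= \int (\partial_t f) \ln h \, dx\,dv
     \qquad \Big(\text{since } \textstyle\int \partial_t f \, dx\,dv = 0\Big) \\
  &= \int f \, A \cdot \nabla \ln h \, dx\,dv
     - \frac{q}{2m^2}\int (\partial_v f)(\partial_v \ln h) \, dx\,dv \\
  &= -\frac{q}{2m^2}\int f \, (\partial_v \ln h)^2 \, dx\,dv
     + \int \Big[A f_* - \frac{q}{2m^2}(\partial_v f_*)\, e_v\Big] \cdot \nabla h \, dx\,dv \\
  &= -\frac{q}{2m^2}\int f \, (\partial_v \ln h)^2 \, dx\,dv .
\end{align*}
```

The bracketed term integrates to zero by parts, because $f_*$ is stationary. Since the conditional entropy is minus the integral on the left-hand side, its time rate equals $(q/2m^2)\int f\,[\partial_v \ln(f/f_*)]^2 dx\,dv$, which is manifestly non-negative.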

5 Conclusions 

Standard notions of thermodynamical entropy are basically used under equilibrium or near-equilibrium
conditions. The primary built-in concept is an equilibrium (steady) state and the behavior of entropy in
the time domain is seldom addressed.

If one attempts to analyze the dynamics of an approach towards the prescribed steady state, it is
necessary to pass to the time domain where the non-equilibrium and often rapid dynamical processes
take place. Various notions of entropy may be designed to quantify such non-equilibrium phenomena.

Our analysis of simple diffusion-type models indicates that the very notion of entropy, except perhaps
for the standard Clausius thermodynamical entropy, is non-universal and purpose-dependent. In particular,
the conditional Kullback-Leibler entropy is regarded (in reference to the "purpose") as the only
valid entropy growth justification in terms of model systems (in conformity with the standard
interpretation of the second law of thermodynamics for closed systems).

However, a deeper insight into the underlying physical phenomena (power/heat transfer processes in
the mean) is available only through the differential (Gibbs-Shannon) entropy, whose temporal behavior
is generically inconsistent with the "entropy growth" pattern. Moreover, the Gibbs-Shannon entropy
balance equation contains the conditional Kullback-Leibler entropy time rate as an explicit non-negative
"entropy production" or rather "entropy accumulation" term, see e.g. subsections 4.2 and 4.3. The
entropy dissipation may proceed through the previously mentioned mean power transfer mechanism;
however, the involved "heat transfer" expression $\dot{Q}$ is not necessarily positive-definite.

The conditional Kullback-Leibler entropy is an appropriate tool in the case of "slow" processes, and in
the asymptotic (large) time regime. The Gibbs-Shannon (differential, information) entropy is perfectly
suited for the "shortest description length analysis", in particular for the study of rapid changes in
time of the probability distribution involved.